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Abstract 

Integrated tempering sampling (ITS) method is an approach to enhance the 
sampling over a broad range of energies and temperatures in computer sim- 
ulations. In this paper, a new version of integrated tempering sampling 
method is proposed. In the new approach presented here, we obtain pa- 
rameters such as the set of temperatures and the corresponding weighting 
factors from canonical average of potential energies. These parameters can 
be easily obtained without estimating partition functions. We apply this new 
approach to study the Lennard- Jones fluid, the ALA-PRO peptide and the 
single polymer chain systems to validate and benchmark the method. 
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1. Introduction 

Because the free energy landscape of typical macromolecular system is 
rough and complicated with plenty of minima and barriers, it is difficult 
to search global free energy minimum using conventional molecule dynam- 
ics (MD) or Monte Carlo (MC) simulations. In the last decades, a variety 
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of methods have been developed to achieve an extensive sampling of con- 
figurational space. These methods include umbrella sampling |2j, replica 
exchange method(REM) 0, [3, multicanonical simulation metadynam- 
ics 0], simulated tempering jl, essential dynamic sampling fiolj. Wang- 
Landau algorithm [ill, 13] , temperature accelerated sampling Gjl, and so 
on. Many of these methods are based on generalized ensemble [14J in which 
each configuration is weighted by a non-Boltzmann probability factor, and 
thus a random walk in energy space could be achieved, i.e. via a multi- 
canonical method. These generalized ensemble methods have been exten- 



sively applied to the studies of, for example, spin glass [15| and protein 



folding [16|, [17|. However, the non-Boltzmann probability factor is usually 
unknown and should be determined by iteration processes. These iterations 
are non-trivial and even difficult for complex systems, therefore some meth- 



ods are proposed to accelerate the convergence of iteration processes [18|, [19 



Recently, an integrated tempering sampling (ITS) method for enhancing 



the sampling in energy and configuration space was proposed [20|, |21|, [22 
This method is based on a generalized (non-Boltzmann) ensemble which 
allows an enhanced sampling in a desired broad energy and temperature 
range. In this generalized ensemble, the probability of a configuration of the 
system under study is proportional to a summation of Boltzmann factors at a 
set of temperatures, with each Boltzmann factor carrying a weighting factor. 
These weighting factors can be determined by the condition that each term 
in the summation contributes a predefined fraction. 

In the original ITS method, the weighting factors can be estimated in an 
iterative way, which may be time-consuming for large systems. In this study, 
we follow the line of ITS method and derive the expression of weighting 
factors through optimizing the energy distribution in the simulations. The 
values of weighting factors only depend on the average potential energy of the 
system, which do not have to be very accurate and can be easily calculated 
by conventional MD or MC simulations. This process avoids iteration, so 
the weighting factors can be determined easily and quickly. Moreover, the 
temperature distribution of an ITS simulation is very important. A broad 
energy distribution cannot be generated unless a proper temperature range 
is chosen. Here we also propose an easy-to-use way to generate temperature 
distribution that can ensure a reasonable energy distribution. 

This paper is organized as follows. In section [21 the theory and computa- 
tional scheme will be described in detail. In section [31 we apply the method to 
the studies of Lennard- Jones fluid, a small peptide and single polymer chain 
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to validate and benchmark the method. Conclusion is drawn in section |H 



2. Method 

2.1. Generalized ensemble 

ITS method is based on the generalized ensemble to get a distribution 
covering a broad range of energies. We define the generalized distribution 
function W(r) as a summation of a set of Boltzmann factors at different 
temperatures T k : 

W(r) = J2 n ke~ PkU{r) k = l,2,...,N. (1) 
k 

In Eq. (JTJ), 0k = 1/ksTk, ks is Boltzmann constant. In this study, we assume 
that all the terms in the summation are ranked as temperature increases. The 
probability to find a configuration with potential energy U is proportional 
to W(r). Eq. ([TJ shows that the generalized ensemble is closely associated 
with the canonical ensembles at different temperatures. The properties of 
the generalized ensemble can be calculated from those canonical ensembles. 
For example, the partition function is: 

Qw = j W(r)dr =[Y1 n k e-^ u{r) dr = £ n k Q k . (2) 

k k 

In Eq. (j2J), Qw is the partition function of the generalized ensemble and Qj~ 
is the partition function of the canonical ensemble at temperature T k . The 
ensemble average of thermodynamic quantity (A)w is 

... MrWr)* J E^W. 

[ > w }W(r)dr /£«,£-»>«* Y,n k Q k ' 1 ' 

k k 

Here, A is a thermodynamic quantity, (A) w denotes generalized ensemble 
average of A, (A) k denotes canonical ensemble average. The potential energy 
probability density of the generalized ensemble Pw(U) is 

_ n{U)W{r) _ k 

Pw{U) ~ JwEdV " EnkQk ' (4) 

k 
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in which n(U) is the density of states, and Pk{U) is the potential energy 
probability density of the canonical ensemble at temperature Tj.. In a special 
case, if nfc = pf- (c is a nonzero constant), Eq. (Efl) becomes 

Pw(U) = k = -J2 PkW • (5) 

k k 

Importantly, the properties of any canonical ensemble whose temperature is 
in the desired range, i.e. Tj G [T 1? T/v] can be calculated by a reweighting 
scheme from the generalized ensemble by Eq. (E]) and Eq. (J7J): 

In Eq. ((7j), Pp^U) denotes the potential energy probability density at inverse 
temperature /3j and Qp. denotes the partition function of canonical ensemble 
at inverse temperature (3j. 

In ITS simulation, the generalized distribution function of Eq. ([1]) can be 
obtained by running a simulation with a modified potential U'(r) at desired 
temperature T. U(r) is defined through 



e -PU'{r) 



W(r) = J2 n ke~^ kU{r) , (8) 



and can be simply written as: 



U>(r) = ~ In J2n k e-^ u{r) . (9) 

' k 

The biased force Fb that is used in the Newtonian equations of motion with 
the modified potential U'(r) becomes 

r _ dU'(r)_ dU'(r)dU(r) _ \ kPkC p 



<9r 0tf(r) <9r ^n k e~^ u ^ 

k 
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In Eq. (fTUl) . F is the force calculated using original potential function of 
the system under study. To implement this ITS method in an MD software 
package, we only need to modify the integrator, which calculates the biased 
force by Eq. ([TO]) , leaving other software codes such as subroutines for force 
calculation unchanged. Therefore, the ITS method supplies an easy and 
efficient way to scan a larger span of energy distribution. 

2.2. How to determine n k and ft k 

The key issue in ITS method is how to determine the weighting factors 



n k . In original ITS method [22[, to calculate n k , m k is defined as 



1 k — 1 

m k = fc > l ' ( n ) 

n k -l 

so n k can be obtained by the product of rrik, 

k 

n k = n 1 Y[ni j , (12) 

3=1 

and P k on is defined as product of n k and Q k , 

Pt on = n k Q k = n k j e-W^dr . (13) 

In practice, a set of initial guess of m k is made, then short ITS simulations are 
performed and m k are updated in an iterative way to make P k on of adjacent 
temperatures equal. Values of n k are determined by Eq. (fT2]) and the target 
values of n k are simply ^- (c is a nonzero constant). 

In this study, we propose an alternative way to get the values of n k quickly, 
easily and without an iteration. First, we define the energy U k (shown in 
Fig. UJ (a)), at which the values of two adjacent terms in W(r) are equal. It 
gives 

n k e-^ u " = n k+1 e-^ u " . (14) 



We can easily obtain the expression of U% as 



uP = \nn k - jnwfc+i 

fik — Pk+1 
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As mentioned before, terms in W(r) are ranked as temperature increases. 
According to mathematical property of exponential function, U% increases 
with increasing temperature: 

Ul<U*<...<Ul<...<Ul_ 1 . (16) 

This sequence divides the energy into N ranges. Provided that energy U is 
in the range U^_ x < U < [/£, the k-th term in W(r) is the largest one (as 
illustrated in Fig. OD (a)): 

n ie -^ u < n 2 e-^ u <...< n k e~^ u > ...> n N e~^ u . (17) 

If we define weighting functions by 



there is a maximum of weighting function in the range U^_ 1 < U < U%. 
The value of normally decays rapidly as energy U varies. In other ranges, 
the value of could be rather small even negligible. This property indicates 
that in the range U%_ 1 < U < U%, the value of W(r) is dominated by its 
fc-th term, and the property of generalized ensemble resembles the canonical 
ensemble at temperature T k . 

We then define energy (shown in Fig.[T](b)) meeting the condition that 
the potential energy probability density function of the canonical ensemble 
at temperature T k is equal to that of the canonical ensemble at temperature 
Tfc + i, that is, ideally we have 

P k (Ul) = P k+1 (Ul) . (19) 

Eq. (fT9~j) can be written as: 

n{U q k )e-^ u l n{Ul)e-^ u k 



Qk Qk+1 

Then, we can get the expression of U\ as 

_ In Q k+1 - In Q k 



(20) 



Xji = _ . (21) 
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Provided that the potential energy average of the system will increase as the 
temperature increases, U\ also increases as temperature increases, i.e. Uf < 
U% < ... <Ul < ... < U%_ v And similarly in the range Ul_ x < U < U q k , 
there is a maximum for function Pk(U). 

To optimize the energy distribution generated in ITS simulation, when 
W(r) is dominated by the k-th term in the range U%_i < U < U%, the 
maximum of the potential energy probability density function should be in 
the same range, that is 

Ut = Ul . (22) 
If we substitute Eq. (IT51) and Eq. (|2"2"|) into Eq. (|2~T1) . we can conclude that 

Uh = Wk ' (23) 

Eq. f[2"5j) is consistent with the result reported in original ITS method, and it 
indicates that the optimizin g co ndition we present here is essentially identical 



to the way proposed in Ref. 22J. If we only substitute Eq. (Tl5|) into Eq. (T52 



we obtain the recursive relation of n k . 

\nn k -\nn k+1 = U q k (/3 k - f3 k+1 ) . (24) 
In Eq. (|24|) . n\ can be simply set to 1 and U k can be estimated in the following 



way 



= .nQ^-lnft ^ l^^M., = l ((f/)t+(f/)t+l) . (25 ) 

Pk — Pk+1 * OPk OPk+l 4 



In Eq. (1251) . the slope of a secant line is approximated by average of the slopes 
of tangent lines at two line terminals. The potential energy averages can 
be evaluated through conventional MD simulations. According to Eq. (T24l) 
and Eq. (T25]) . we can easily determine the values of n k one by one without 
estimating the partition functions. 

The temperature distribution is crucial to the energy distribution gen- 
erated in ITS simulation. It seriously affects the efficiency of ITS method. 
Here we also propose an easy way to determine a reasonable temperature 
distribution, which can actually be determined by the requirement that the 
ratio between energy probability density functions at two adjacent tempera- 
tures is a constant t when the energy is equal to the potential energy average 
at the lower temperature, 
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In Eq. (I26p . the parameter t is named overlap factor, which is related to the 
space between two adjacent temperatures and the total number of tempera- 
tures in the desired temperature range. Eq. fl26|) can be rewritten as 

n((U) k )e-Pk(u)k 

t. (27) 



n{{U) k )e- p k+^U)k 

Qk + l 



Through simple derivation, one can get the recursive relation of inverse tem- 
perature, 

ft " A+i = ut l {u)k ■ (28) 

Eq. (1281) contains only one adjustable parameter t. Once the overlap factor t 
is determined, the temperature distribution will be completely determined. 

Because the idea of ITS method is quite similar to that of replica exchange 
method 0, \^ , we then choose the value of overlap factor t by comparing ITS 
to REM. REM is based on simultaneous simulations of multiple replicas of 
the same system at different temperatures. At regular intervals, N indepen- 
dent simulations are allowed to switch temperatures with each other with the 
acceptance ratio defined in Eq. (1291) . In this way, it is possible for low tem- 
perature replicas to gradually migrate up to higher temperatures and back 
again. 

P aC c(U k , (3 k U k+1 , p k+1 ) = min{l, e ^~M^ + ^)} . (29) 

In Eq. (129"]) . U k and Uk+i are potential energies at temperatures T k and 
Tfc +1 , respectively. For efficient REM simulations, the choice of temperatures 
should guarantee sufficient overlap between all adjacent pairs over the en- 
tire temperature range and give the same mean acceptance ratio between 
those adjacent pairs. Various approaches to optimize the temperature dis- 
tribution of REM simulations had been proposed. Sanbonmatsu and Garcia 
performed short simulations at a few temperatures, then fitted average ener- 
gies with polynomial and determined the temperature distribution by solving 



Eq. (12"9"|) in an iterative way [23(. de Pablo and coworkers presented a simi- 
lar approach and demonstrated that under the assumption that the energy 
probability density function is Gaussian, the relation between acceptance ra- 
tio P acc and the overlap of energy probability density function at two adjacent 



temperatures is system independent [24 . 
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For ITS method, substituting Eq. into Eq. ([25)1 . we get: 



a -^ i= m^W (30) 

so, 

e Vh+i-Pk)mk+i-(v)k) = r 2 . (31) 

The left side of Eq. ( 131]) is the mean acceptance ratio in REM simulations. 
Suppose that if a set of temperatures could give a reasonable acceptance 
ratio in REM simulations, there should be enough overlap between adjacent 
temperatures, and this set of temperatures would also work in ITS simulation. 
Thus, giving the left side of Eq. ( l3Tj) a proper value in the range of ~ 1 
will determine the value of overlap factor t. Then the complete temperature 
distribution is determined further by using Eq. fl28|) . 



2.3. Computational procedure 

We propose a new computational procedure of ITS simulation. 

1. Determine the desired temperature range. 

2. Choose a set of temperatures in the desired range and generate short 
replica exchange simulation trajectories to calculate the potential en- 
ergy averages of the system at those temperatures. For simple systems, 
conventional MD or MC simulations can also be used. Then the rela- 
tion between potential energy average and temperature is obtained by 
interpolation. 

3. Determine the ITS temperature distribution and the corresponding 
weighting factors rik through Eq. (124)) . Eq. ( 125|) and Eq. (128]) . 

4. Use the parameters generated in step |3] to perform ITS simulation, 
which is essentially a conventional MD simulation using biased force 
calculated by Eq. (ITU]) . 

5. After ITS simulation, the canonical ensemble properties can be calcu- 
lated by Eq. © and Eq. ©. 

3. Applications 

3.1. Lennard- J ones fluid 



Lennard- Jones (LJ) fluid is a widely used benchmark system 25[. To test 



the validity of this ITS method, we consider the LJ fluid system reported in 
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Ref. 26[ and compare our results with literature data. The LJ potential is 



UUr)=HC-) l2 -^n (32) 

The system contains 864 particles. In our simulations for LJ particles, 
conventional reduced units are used. The number density is therefore p = 0.8 
and the cutoff distance is r r = 4.0. Integration timestep of 0.001 is used. 



Long range correction U ta u [27j is applied with 



U tail = ^p[(-) 9 - 3(-) 3 ] . (33) 

First, we perform a set of conventional (canonical) MD simulations at dif- 
ferent temperatures to obtain the potential energy versus temperature curve. 
Because our purpose is to test the validity of our newly-proposed ITS pro- 
cedure, we try to obtain the potential energy curve as accurate as possible. 
So 15 temperatures are chosen in the range of 1.4 ~ 2.0 for this purpose. 
At each temperature, we run 1 x 10 6 steps canonical ensemble simulation to 
calculate the potential energy average after equilibrium. Linear interpolation 
is applied to obtain the potential energy average in the desired temperature 
range. Then the temperature distribution fa is obtained by solving Eq. (T28|) . 
The overlap factor t is set to e 0,5 , which generates 9 temperatures in the 
range of 1.4 ~ 1.91. The corresponding weighting factors are determined 
using Eq. (I24p . After that, We perform 1 x 10 7 steps ITS simulation and 
calculate the canonical average of potential energy at three different temper- 
atures through reweighting scheme using Eq. ([6]). The results are shown in 
TABLE [TJ The potential energies per particle at different temperatures are 



in good agreement with the literature data [26j. These results indicate that 
our newly proposed ITS method is validated. 

We also calculate the potential energy probability densities of canonical 
ensembles at T 5 and T 6 through reweighting scheme using Eq. (IT]). These two 
temperatures are chosen because they are in the middle of the temperature 
range between 1.4 ~ 1.91. The results are shown in Fig. |2J We can see that 
the two curves overlap largely. Moreover, the value of £7| read from Fig. [2] 
(the cross point of the two curves) is —4227.7, in agreement with the value 
estimated by Eq. ( 125]) . which is used in the ITS simulation, —4225.2. There- 
fore, our method can ensure sufficient energy distribution overlap between 
adjacent temperatures, and the approximation method used in Eq. (125]) yields 
a good accuracy. 
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3.2. ALA-PRO peptide in implicit solvent 

We then apply this ITS method to study the trans/cis transition of ALA- 
PRO peptide. Real units are used in the following. For comparison, we also 
use replica exchange molecular dynamics (REMD) and conventional MD to 
study the conformational transition of this peptide. 

The structure of ALA-PRO peptide is shown in Fig. [3J The dihedral 
angle u indicated in Fig. [3] can be defined as the reaction coordinate of 
trans/cis transition of this peptide. In our simulations, we use modified 



GROMACS 4.5.5 package |28| and AMBER 99sb force field [29|. Generalized 



is 



Born solvent accessible surface area (GBSA) implicit solvent model [30 
adopted. The LINCS 3l| algorithm is used to constrain all bonds containing 
hydrogen atom. In all simulations, the integration timestep is set as 1 fs. We 
perform 1 /xs simulations using ITS, REMD and conventional MD methods, 
respectively. In REMD simulation, eight replicas with temperatures at 283, 
335, 400, 478, 564, 680, 805, and 905 K are used, which result in an exchange 
acceptance ratio of roughly 50%. The exchange attempt frequency is 1 ps -1 
in REMD simulation. For ITS, REMD and conventional MD simulations, 
the same initial structure is taken. 

To test the robustness of ITS method, the potential energy average curve 
is obtained from the first 10 ps REMD simulations (averaging over 10 frames), 
which is apparently quite approximate to estimate n^. In our ITS simulation, 
the overlap factor t is set to e ' 05 , which generates 22 temperatures in the 
range from 283.0 to 948.82 K. The potential energy averages calculated from 
10 ps, 1 /is REMD simulations and ITS simulation are shown in TABLE |2j 
It is clear that the potential energy averages calculated from 10 ps trajec- 
tory of REMD simulation are not accurate enough, whereas potential energy 
averages calculated from 1 /is REMD and ITS simulation are in good agree- 
ment. The visited potential energies in ITS, REMD, and conventional MD 
simulations are shown in Fig. HI ITS method can effectively explore a broad 
range of potential energy as REMD method, whereas the conventional MD 
can only explore a limited potential energy range. An important thing is, 
although the values are obtained from potential energy averages without 
enough accuracy (i.e., only from 10 ps REMD trajectories), the ITS method 
is still quite efficient on exploring the configuration space. It implies that for 
estimating nk, several short simulations at different temperatures are enough. 
We denote (U)k and t/| obtained from short simulations as (U) s k and 
the true values of ([/% and U% are denoted as (U)\ and Uf, respectively. 
The optimizing condition of potential energy distribution is that when W(r) 
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is dominated by the k-th term in the range U%_i < U < U%, the maximum 
of the potential energy probability density should be in the same range. In 
fact, this optimizing condition can be achieved when the following condition 
is satisfied: 

K < (U)Ui < U q k li(k = 1, 2, • ■ ■ , N - 2) . (34) 

Thus, the potential energy averages obtained from short simulations can vary 
in a quite wide range and are not necessary to be that accurate. 

To clarify the influence of overlap factor t on ITS simulations, we also try 
different values of overlap factor t for this dipeptide system. The standard 
deviation of potential energy, ad, is employed to characterize the width of 
the energy range visited in the ITS simulations: 

= VV> - (Uf • (35) 

The results are shown in TABLE [3j When t is set to e 29 8 , the overlap 
between adjacent canonical ensembles is very small, thus the energies of 
the system under study are trapped in a narrow range. As t decreases, 
the system can visit a broad range of energies. Fig. |5] shows the potential 
energy distribution generated by t = e 5 ' (corresponds to 3 temperatures) 
and t = e 05 (corresponds to 22 temperatures). Because the overlap between 
the 3 temperatures (when t = e 5,0 ) is insufficient, there are obviously 3 peaks 
in the potential energy distribution, and the peak corresponding to the lowest 
temperature is very high, indicating a low sampling efficiency of high energy 
range. While for t = e 05 (22 temperatures), a more uniform potential energy 
distribution is generated. A good choice of overlap factor therefore should 
ensure sufficient overlap between adjacent temperatures. 

To illustrate the sampling efficiency in configuration space, we compare 
the dihedral angle (lj) distributions obtained in ITS, REMD and conventional 
MD simulations, as shown in Fig. |6j Because the free energy barrier is pretty 
high for trans/cis transition of this dipeptide, in conventional MD simulation, 
no transition occurs and a unimodal u distribution is observed. While in ITS 
as well as in REMD simulations, bimodal distributions of u are observed. The 
result indicates that both ITS and REMD methods can overcome high free 
energy barrier and enhance the sampling of configuration space. 

To further compare the sampling efficiency of ITS and REMD methods, 
root of mean square derivation (RMSD) of potential of mean force (PMF) 
along the reaction coordinate (cu) is investigated. PMF along the dihedral 
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angle (cu) is denned as 



F pm f{uo) = -k B Tln (p(co)) . (36) 
In Eq. f[3"6"j) . (p(uj)) is average density function denned in Eq. ( 13T|) 

Vjr) 

i , nv J 5(w (r) - w)e fc B T (ir 

(pH) = , (37) 

J e k B T dr 

which can be calculated through a reweighting scheme using Eq. ([6]). Fig. [7] 
shows the time evolution of RMSD of PMF. The RMSD of PMF converges 
much more quickly in ITS simulation than that in REMD simulation. The 
result implies that ITS method is more efficient than REMD in sampling 
configuration and energy space. 

Another important advantage of ITS simulation is that it requires less 
computational resources. It is necessary in REMD simulation to launch sev- 
eral simulations simultaneously, while in ITS simulation, only one trajectory 
is needed. The computational resources required by ITS is almost the same 
as conventional MD simulation. In this dipeptide case, the CPU time for 
REMD simulation is about 263 hours (eight trajectories in total) and for 
ITS simulation is about 35 hours (only one trajectory is needed). For this 
simplest peptide system, the REMD simulation is nearly 8 times computa- 
tional expensive. 

3.3. Coil-globule transition of a flexible single polymer chain 

The transition of a flexible polymer chain from a random-coil conforma- 



tion to a globular compact form has been extensively studied [32j, [33j, [34 
In order to demonstrate the applicability of this ITS method, we apply it 
to study the coil-globule transition of a flexible single polymer chain in im- 
plicit solvent. By calculating the mean square radius of gyration ((Rg 2 )) 
of the polymer chain at different temperatures, we can get the transition 
temperature of coil-globule transition. 

Conventional reduced units are used in the following. We consider a 
coarse-grained model of polymer with 100 beads connected by finite extensi- 
ble nonlinear elastic (FENE) potential, 

U FENE (r) = ln(L ° " & + UWCA r<r ° , (38) 

oo r > r 
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in which 



Uwca(t) 



±e W cA (™) -(™) +e WC A 



r < 2&a WC A 
r > 2^a WC A 







(39) 



and r is the bond extension parameter, K is the attractive force strength, 
and r is the instantaneous bond length. We set <tvi/cm = 1-05, Swca=^-^, 
ro=1.5, and K=20. The Lennard- Jones potential in Eq. ( )32|) is used between 
non-bonded beads with a =1.0 and e—1.0. For comparison, we perform both 
ITS and MD simulations with the same initial chain configuration and the 
same parameter set. 

The potential energy curve is obtained by 1.0 x 10 6 steps MD simulation at 
8 temperatures in the range of 1.0 ~ 4.5. We then perform 1.0 x 10 9 steps ITS 
simulation, for which the overlap factor t is set to e a5 and 18 temperatures 
are generated in the range of 1.0 ~ 4.35. By performing one ITS simulation, 
we can get the (Rg 2 ) at any temperature in the temperature range. By calcu- 
lating the first order derivative, the transition temperature can be identified, 
as shown in Fig. [8] for (Rg 2 ) at 300 temperatures. As temperature increases, 
the value of (Rg 2 ) also increases. Because the curve for (Rg 2 ) produced by 
ITS simulation is smooth with high revolution, we calculate first order and 
second order derivative directly by difference method. The peak of the first 
order derivative of (Rg 2 ) ~ T corresponds to the transition temperature. 
By calculating the second order derivative, we easily identify the transition 
temperature as 2.42 for coil-globule transition of a polymer chain with 100 
beads. 

For comparison, we also use brute-force canonical MD simulations to 
study the coil-globule transition of this polymer chain. We perform 31 MD 
simulations at 31 temperatures spaced 0.1 in the range of 1.0 ~ 4.0. The 
length of each simulation is 1.0 x 10 9 timesteps. As shown in Fig. [HJ the 
curve of (Rg 2 ) ~ T produced by canonical MD simulations basically overlaps 
with the one produced by ITS simulation. Because the resolution of the 
curve produced by MD simulations is low and the noise of data is significant, 
we employ polynomial fitting method to calculate the derivative, we try 
different orders of polynomial, and find that the 8-order polynomial can best 
reproduce the results of ITS simulation. Because the computational cost of 
ITS simulation is nearly the same as conventional MD simulation and we can 
get even more accurate data with high resolution in one ITS simulation than 
massive canonical MD simulations, ITS method is much more efficient than 



14 



conventional MD on identifying the coil-globule transition temperature for 
polymer chain. The results indicate that ITS method can be used to study 
complex polymer systems with high efficiency and accuracy. 



4. Conclusion 

In this study, we present a new version of ITS method that provides an 
easy, quick and robust way to generate suitable parameters. In this method, 
the only input is potential energy average in the desired range of temper- 
atures. Reasonable values of weighting factors can be obtained directly 
from potential energy average without iteration, even though the potential 
energy average is not that accurate. It is also easy to determine the tem- 
perature distribution in which there is sufficient overlap between adjacent 
temperatures by choosing a reasonable overlap factor. This method is very 
efficient for exploring configuration space and calculating thermodynamic 
quantities. By running one ITS simulation (i.e., one trajectory), we can 
sample basically the same configuration space as REMD simulations in the 
same temperature range. But in ITS method, we do not need to launch 
tens of parallel simulations simultaneously, so it is extremely suitable to be 
implemented in GPU version of typical simulation packages for enhanced 
sampling. 

In the method we proposed, to determine the weighting factors in ITS 
simulation, we use the optimizing condition that when W(r) is dominated 
by the fc-th term in the range If^-i <U < £/£, the maximum of the potential 
energy probability density should be in the same range. This optimizing can 
be easily satisfied even the potential averages are not very accurate, so we 
could estimate these parameters by short MD simulation trajectories. 

Glass transition is fundamental and challenging problem in solid state 
physics and also an important phenomenon in material science. The debate 
about whether the glass transition is a thermodynamic phase transition or a 



dynamic phenomenon has been lasting for decades |35l . l36l l37j . Because this 
ITS method is a powerful tool to calculate the thermodynamic quantities, we 
hope this method will contribute to solving the problem of glass transition. 
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Table 1: Comparison of potential energy per particle (in 
reduced unit) of Lennard- Jones fluid in ITS simulation and 
from literature. 



Temperature ITS results Literature data [26 



1.4 -5.199 -5.199 

1.6 -5.045 -5.046 

1.8 -4.895 -4.896 
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Table 2: Potential energy averages of ALA-PRO dipeptide calculated from 10 ps 
and 1 [is trajectories, respectively, and ITS simulation at different temperatures. 



Temperature (K) 10 ps (KJ/mol) 1 /is (KJ/mol) ITS (KJ/mol) 



283 


-487.83 


-491.93 


-491.94 


335 


-480.13 


-478.51 


-478.55 


400 


-459.46 


-461.82 


-461.88 


478 


-439.46 


-441.92 


-442.03 


564 


-414.80 


-420.18 


-420.33 


680 


-369.86 


-391.09 


-391.35 


805 


-345.94 


-359.96 


-360.32 


950 


-301.71 


-324.01 


-324.30 
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Table 3: Different overlap factor t, number of temperatures, standard deviation of 
potential energy. 



Overlap 


Number 


Standard deviation of potential 


factor t 


of temperatures 


energy (KJ/mol) 


e 29.8 


2 


17.10 


e 5.0 


3 


39.71 


e 0.5 


8 


53.24 


e 0.05 


22 


56.40 


conventional MD 


1 


32.9 
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Figure 1: Schematic illustration of U% and {/?. (a) U% is defined as the energy at which 
the values of two adjacent terms in W(r) are equal. In the range of U^_ 1 < U < U%, 
the fc-th term in W(r) is the largest one. (b) U% is defined as the energy at which the 
potential energy probability density functions of the canonical ensembles at temperature 
Tk and at temperature Tk+i are equal. 
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Figure 2: The potential energy probability densities of canonical ensembles at temperature 
T5 (blue square) and Tq (red circle). 
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Figure 3: The structure of ALA-PRO peptide. The dihedral angle u> is defined as the 
reaction coordinate. 
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Figure 4: Potential energies visited in ITS, REMD and conventional MD simulations, (a) 
ITS (green triangle); (b) REMD ( red circle); (c) conventional MD (black square). 
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Figure 6: The distribution of the dihedral angle lo in ITS simulation (red circle), REMD 
simulation (green triangle) and conventional MD simulation (black square). 
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Figure 7: The convergence of RMSD of PMF in ITS simulation (red circle) and REMD 
simulation (green triangle), respectively. It is clear that the RMSD of PMF converges 
much more quickly in ITS simulation than that in REMD simulation. 
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Temperature (reduced unit) 



Figure 8: The radius of gyration (a) and its derivative (b) versus temperature in ITS 
simulation (red circle) and MD simulations (black square) . The derivative curve for ITS is 
computed by difference method and the derivative curve for MD is computed by 8-ordcr 
polynomial fitting. 
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